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Abstract 

The plant cell wall performs a number of essential functions including providing shape to many different cell types and 
serving as a defense against potential pathogens. The net pattern mutation creates breaks in the seed coat of soybean 
(Glycine max) because of ruptured cell walls. Using RNA-Seq, we examined the seed coat transcriptome from three stages of 
immature seed development in two pairs of isolines with normal or defective seed coat phenotypes due to the net pattern. 
The genome-wide comparative study of the transcript profiles of these isolines revealed 364 differentially expressed genes 
in common between the two varieties that were further divided into different broad functional categories. Genes related to 
cell wall processes accounted for 19% of the differentially expressed genes in the middle developmental stage of 100- 
200 mg seed weight. Within this class, the cell wall proline-rich and glycine-rich protein genes were highly differentially 
expressed in both genetic backgrounds. Other genes that showed significant expression changes in each of the isoline pairs 
at the 100-200 mg seed weight stage were xylem serine proteinase, fasciclin-related genes, auxin and stress response 
related genes, TRANSPARENT TESTA 1 (TTl ) and other transcription factors. The mutant appears to shift the timing of either 
the increase or decrease in the levels of some of the transcripts. The analysis of these data sets reveals the physiological 
changes that the seed coat undergoes during the formation of the breaks in the cell wall. 
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Introduction 

Plant cells are surrounded by cell walls which provide tensile 
strength, mechanical support, protection from insects and 
pathogens, prevention of water loss, and that participate in cell 
to cell communication [1,2,3,4]. The cell walls difiFer in 
composition in different plant parts but the basic composition of 
the cell waU includes cellulose, hemicellulose, pectins, and small 
proportions of structural proteins including proline-rich and 
glycine-rich proteins. The seed coat is the outer protective layer 
of a seed and develops from the integument originally surrounding 
the ovule and is maternal in origin. The seed coat provides 
protection of embryo and endosperm from mechanical injuries, 
insects, bacteria, and fungi, and desiccation of the seed. Seed coats 
of different species vary in structure and composition including 
extensive differentiation of cell layers into specialized cell types. 
These cells may accumulate large quantities of substances 
including mucilage or pigments depending on plant species. 

The seed coat, or testa, of the mature soybean [Glycine max) has 
been well characterized, and contains features in common with the 
majority of the legumes: an epidermal layer of pahsade cells, or 
macrosclereids, a sub-epidermal layer of hourglass cells, or 
osteosclereids, a few layers of parenchyma, and an aleurone layer 
[5,6,7] . Soybean seeds begin to form on the plant at the R4 stage 
when the parent plant has between 13 and 20 leaf nodes [8]. At 
this stage, the seed is going through multiple cycles of cell division 



as well as tissue dilferentiation. Invertase in the seed coat cleaves 
sucrose giving rise to hexose which surrounds the embryo of the 
minuscule soybean seed [9]. The seeds grow at a rapid rate 
between the R4 and R7 stages as they accumulate carbon, 
nitrogen, and seed storage proteins. Seeds increase in size from 
25 mg to 500 mg fresh weight during the R5 and R6 stages. 
Towards the end of the R6 stage, nutrient accumulation in the 
seed begins to decrease. By R7 the seed has amassed almost all the 
dry weight it wiU acquire and contains approximately 60% 
moisture and has begun to yellow. After reaching the maximum 
fresh weight, near 400-500 mg, the seed readies itself for 
desiccation and begins to lose total fresh weight. By the R8 stage, 
most of the pods and seeds have browned and are dry [8] . 

Intact seed coats are preferred but a mutation known as the "net 
pattern" in soybean results in defective seed coats that have 
ruptured by maturity to expose the cotyledons underneath. The 
trait has been backcrossed to create isohnes in two different genetic 
backgrounds; however, very little is known about the histology or 
molecular nature of the trait. Using RNA blots, we have previously 
reported a developmental delay in the decline of transcripts for a 
specific proline-rich protein (PRPl) of the cell wall leading to 
higher levels of this transcript in the defective seed coats at the 
middle weight range of 100-200 mg seed weight [10]. 

The effect on PRPl transcript levels was assumed to be a 
downstream effect of the net pattern mutation. In this report, we 
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present the global gene expression analysis of the net pattern cell 
wall mutant in soybean using transcriptome analyses of three 
different stages of seed development at 50-100 mg, 100-200 mg, 
and 400-500 mg. By comparing RNA-Seq data from the two lines 
containing the defective seed coat trait compared to their standard 
counterpart isohnes using Bowtie alignments [1 1] to the soybean 
genes models from the sequenced soybean genome [12], 364 
significantly differentially expressed genes in common between the 
two isoHnes were revealed, many of which were involved in cell 
wall processes. These data aid in determining perturbations in the 
physiological development of the plant cell wall which affects 
agronomicaUy important traits and is a source of bioenergy [13]. 

Results 

Phenotype Comparisons of Isoline Pairs with Standard 
and Defective Seed Coats 

The isolines in the background of cultivar Clark are referred to 
as Clark standard (CS) which exhibits an intact seed coat and 
Clark defective (CD) that contains the defective seed coat mutation 
that had been backcrossed into the Clark standard line. Likewise, 
Harosoy standard (HS) is compared to the Harosoy defective (HD) 
isoline that contains the defective seed coat mutation. The 
genotypes and phenotypes of these two isoline pairs is presented 
in Figure 1. As shown in Figure 1, all the layers of seed coat are 
ruptured in the mutant isolines. Genetically, mutant isolines 
contain an unknown gene or set of genes here designated as "Def ' 
for d(^ccti\ e seed coat. This abbreviation is not meant to convey 
an ofFicial locus designation. This mutation was transferred during 
the 196()'s by R. L. Bernard of die LISDA/ARS by selection of the 
trait during repetitive backcrossing into both a Clark (with a black 
seed background) and a Harosoy (yellow seed background). The 
defective seed coat phenotype is more striking on the black 
background and is sometimes referred to as a "net" pattern [10]. 
The phenotypes of the two lines along with their alleles controlling 
seed color [14,15,16,17] are shown in Figure 1 and Table 1. The 
two lines arc available from the L^SDA germplasm collection, 
Urbana, Illinois, as shown in Table 1. 

Transcriptome Comparisons of Clark and Harosoy 
Standard Seed Coats to Defective Seed Coats from the 
Mutant Isolines 

Next-generation high throughput sequencing of the transcrip- 
tome (RNA-Seq) was used and the cDNA libraries were 
constructed from seed coats from 50-100 mg, 100-200 mg and 
400-500 mg seed weight stages in standard and mutant isolines of 
Clark and Harosoy cultivars. Each library was sequenced on the 
lUumina GAII or HiSeq 2000 platforms. The number of raw 
sequence reads obtained from all four isolines, at different seed 
weight stages are presented in Table 1. The 75-bp or 100-bp reads 
were aligned to 78,773 target Glyma models from the soybean 
reference genome [12] using Bowtie [11]. The RNA-Seq data 
were normalized in reads per kUobase of gene model per million 
mapped reads (RPKM) considering gene length as parameter [18]. 

As presented in Table SI, 18% and 50% of the genes were 
expressed in the seed coat of Clark and Harosoy isolines at the 
cutoff points of above 10 RPKM and 1 RPKM, respectively. For 
inclusion in further analyses, all genes that showed ^5 RPKM in 
either standard or defective isoline as well a >2 fold differential 
expression between standard and defective isolines were included 
in further functional annotation analyses. An overview of the 
number of these genes is presented in Table 2 which shows that a 
total of approximately 1,300 out of 78,773 genes in the soybean 
genome showed differential expression in the seed coats of either 



the Clark or Harosoy isolines in at least one stage of development. 
The p-values for differential expression were obtained by analysis 
with the DESeq package on hit count data from comparisons 
between Clark and Harosoy standard and defective isolines. The 
gene models that showed p-values ^0.05 were considered as 
significandy differentially expressed. These differentially expressed 
genes along with gene model numbers, annotations, RPKM 
values, basemean, and p-values are presented in Tables S2-S4. 
The graphical presentation of expression levels of these differen- 
tially expressed genes in standard and defective isolines are 
presented in Figures SI & S2. The majority of the differentially 
expressed seed coat genes were expressed at less than 50 RPKM in 
both Clark and Harosoy backgrounds. 

Functional Classification of Genes Differentially 
Expressed in Seed Coats of both Clark and Harosoy 

Isolines 

As shown in Table 2, a total of 364 genes were differentially 
expressed in at least one of the three different seed weight stages in 
both Clark and Harosoy isolines. The detailed information on 
these genes is presented in Table S3 and distribution of these genes 
into different broad functional categories is presented in Figure 2. 
Out of these 364 genes, there were 265 that showed significandy 
difiFerent expression in the 1 00-200 mg weight range while only 5 7 
and 42 genes varied in the 50-100 and 400-500 mg weight 
ranges, respectively. A representative selection of 29 genes is 
presented in Tables 3 and 4. In addition, 248 of these 265 varied 
only at the 100-200 mg stage and did not show differential 
expression at the 50-100 or 400-500 mg weight range. Thus, the 
maximum number of significantly differentially expressed genes in 
common in both the Clark and Harosoy backgrounds are in the 
100-200 mg seed weight range. In the 50-100 mg weight range, 
there were no differentially expressed genes with cell wall 
annotations (Figure 2 A), while at the 100-200 mg and 400- 
500 mg weight range there were 19% and 5% of the differentially 
expressed genes related to possible cell wall functions, respectively 
(Figures 2B and 2C). The 51 cell wall related genes at the 100- 
200 mg stage were further divided into 10 difiFerent categories 
based on cell wall structural components (Figure 3). The 
maximum number (21"/o) of genes fall under the fasciclin-like 
arabinogalactan proteins that have been reported to affect tensile 
strength of the cell wall [19]. Proline-rich proteins are the second 
most abundant category at 15%, while cystcine-rich proteins, 
expansin/extensin-related proteins, and glycine-rich proteins 
(GRPs) represent 10%, 8%, and 4% of the differentially expressed 
cell wall proteins in this developmental stage (Figure 3). 

A selection of 29 of the 364 differentially expressed genes at the 
100-200 mg seed weight stage are shown in Table 3 (in Clark 
background) and Table 4 (in Harosoy background) along with 
their functional annotations, fold-changes in differential expres- 
sion, and p-values. The data for all 364 genes are shown in Table 
S3. Six genes in Tables 3 and 4 have cell wall structural protein 
annotations, including proline-rich proteins PRPl and PRP2 and 
glycine-rich proteins. 

Cell Wall Related Genes Expressed in Both Clark and 

Harosoy Seed Coats 

As presented in Tables 3 & 4, PRPl and PRP2 were two of the 
highly differentially expressed prohne-rich protein genes in both 
Clark and Harosoy isolines. In the Harosoy isolines, both the 
RPKM levels and fold differences for these genes were higher as 
compared to Clark isolines. The expression of the PRPl was 
higher earlier in development and declined as shown in Figure 4A. 
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Standard Line (HS) Defective Line (HD) 

yellow, Irt yellow (Def.), IrtDef 

Figure 1. Comparisons of tlie Defective Seed Coat iVIutation in Soybean. The genotypes and phenotypes of the isolines used in both Clarl< 
and Harosoy bacl<grounds are shown. Representative seed comparing Clarl< Standard (CS) vs. Clarl< Defective (CD) and Harosoy Standard (HS) vs. 
Harosoy Defective (HD) illustrate the defective seed coat mutation exhibiting cracks on seed coat in either a pigmented or non-pigmented genetic 
background. Refer to Table 1 for additional information. 
doi:1 0.1 371 /journal.pone.0096342.g001 



In contrast, PRP2 was expressed only at the later stages of seed 
development as seen in Figure 4B. The differential expression 
between the standard and defective seed coats was maximal for 
both PRPl and PRP2 at the 100-200 mg weight range in Clark 



and Harosoy isolines. PRPI transcripts appeared to decline in 
abundance faster in the standard seed coats of both Clark and 
Harosoy isolines leading to higher levels in the defective seed coats 
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Table 2. Total Number of Differentially Expressed Genes in the Seed Coats of the Clark and Harosoy Isollne Pairs at Different Seed 
Weight Stages. 





Genotypes 


Number of differentially expressed genes 








50-100 mg 


100-200 mg 


400-500 mg 


Total 


Clark 


720 


173 


417 


1310 


Harosoy 


48 


156 


1068 


1272 


Both 


57 


265 


42 


364 


Opposite 


9 


4 


96 


109 



The number of differentially expressed genes in Clark or Harosoy standard versus the defective seed coat isoline that meet the criteria of ^2 fold differential expression, 
>5 RPKM and p-value <0.05 by DESeq analysis. Both: the number of differentially expressed genes shared in common between both the Clark and Harosoy Isoline pairs 
and also the expression levels of these genes are in the same direction i.e. if expression of a particular gene was higher in Clark wild-type and lower in the defective 
isoline, it was also higher in Harosoy wild type isoline and lower in Harosoy defective isoline. Opposite: If a particular gene showed higher expression in the standard 
isoline of Clark compared to the Clark defective, but it showed higher expression in the defective Harosoy compared to the standard Harosoy line. 
doi:1 0.1 371 /journal.pone.0096342.t002 



whereas PRP2 transcripts were always less abundant in the 
defective seed coats. 

The expression levels of PRPl and PRP2 at different seed 
weight stages were previously demonstrated by [10] using RNA 
blots in the Clark isoline. Our previous RNA blots agree with our 
present RNA-Seq data showing that PRPl was expressed more 
abundantly in early development, whereas PRP2 was present in 
the older seed coats. These two Glyma models were called as "low 
confidence" gene models with no annotation by the Glyma gene 
model version 1.0 at the Phytozome database and as extensin- 
related genes in Glyma model version 1.1, whereas the non- 
redundant database of the National Center for Biotechnology 
Information (NCBI) annotated them as proline-rich proteins. We 
conducted manual annotations in the following manner to confirm 
them as the classically defined soybean PRPl and PRP2 genes 
which are in NCBI. The nucleotide and amino acid sequences of 
PRPl from NCBI (J02746) showed 100% match with Gly- 
ma09gl 2200.1 from the soybean genome (Figure S3), and the 
nucleotide sequence of SbPRP2 gene from NCBI J05208) showed 
98.8% similarity with Glyma09gl 2260.1 (Figure S4). There was a 
difference of 10 amino acids between the amino acid sequence of 
PRP2 in cultivars Wayne (amino acid sequence from NCBI) and 
Williams (amino acid sequence from the soybean genome). Both 
PRPl and PRP2 genes are linked and separation between these 
genes is approximately 146 kb (Figure S5). 

The graphical representations of the expression levels of the 
other difiFerentiaUy expressed genes shown in Tables 3 and 4 at all 
three different stages of seed development is presented in Figures 
S6 and S7, respectively, showing that all have the highest 
differential expression at the mid- developmental stages of 100- 
200 mg seed weight. Included is the pattern for several other 
proline and glycine-rich proteins and that of the xylem serine 
proteinase that is overexpressed in the defective seed coats (Figure 
S6-M). In many of these genes, those that are highly expressed in 
the 50-100 mg seed coats and then decline show a pattern similar 
to PRPl in that they are overexpressed in the defective seed coats. 
Likewise, those that are overexpressed in the standard line are 
similar to PRP2 in that they generally have higher RPKM 
expression levels later in development. 

Figure 3 shows that 2 1 % of the cell wall related genes were 
annotated as fasciclin related. The graphical view of expression of 
these fasciclin-like arabinogalactan genes in the seed coats of both 
standard and defective seed coat isolines at three different stages of 
seed development is presented in Figure S8. AH of the genes 
showed expression patterns with significant differential expression 



at 100-200mg seed weight stage and their transcript levels were 
higher in the defective as compared to the standard seed coats, as 
was the case with PRP 1 . Thus, all 11 of these genes were 
overexpressed in the defective seed coat. Figure S9 graphs the 
developmental profiles of the remaining 25 cell wall genes 
overexpressed in the defective isolines and Figure SIO shows the 
remaining 3 cell wall genes that were overexpressed in the 
standard seed coats. 

It is clear from the 68 graphs of the gene expression patterns 
shown in Figures S6-S10 that the 100-200 mg weight range 
shows the highest differential expression and that many of the 
genes can be classified as either declining in expression or 
increasing in expression during seed coat development as is the 
case for the PRPl and PRP2 patterns. In many cases, the net 
pattern mutation appears to increase expression of many of the ceU 
wall related genes in the 100-200 mg weight range as for PRPl . In 
contrast, many of the genes that are overexpressed in the standard 
line appear to be increasing during late seed development in both 
standard and defective lines as does PRP2. 

Hormone Regulated Genes which were Differentially 
Expressed in Both Clark and Harosoy Backgrounds 

There were 2 1 hormone regulated genes that were differentially 
expressed in the seed coat of wild type and defective isolines of 
both Clark and Harosoy background (Table S3) including 
representative auxin and gibbereUin regulated genes that are 
shown in Tables 3 & 4. One gene model Glymal0g35870.1 
showed the best match to the authentic ADR 12 (auxin down 
regulated) gene from NCBI (S58482). It showed approximately 16- 
fold lower expression in the Clark defective seed coats and 4-fold 
lower expression expression in Harosoy defective seed coats 
(Tables 3 & 4). Auxin is an important phytohormone that plays 
several important roles in plant growth and development. The 
expression level of this gene at different stages of seed development 
in both standard and defective seed coat isolines is presented in 
Figure 5A for RNA-Seq data. In the Clark standard isoline, this 
gene showed higher expression in earlier stages, and there was 
continuous decrease in the expression level at later stages of seed 
development. There was significant difierential expression at all 
the seed developmental stages (Figure 5A, left). In the Harosoy 
standard isoline, expression of ADR 12 increased continuously 
with developmental stages and also increased in the defective seed 
coats at the two older stages, although the levels were significandy 
lower in the defective seed coats (Figure 5A, right). The expression 
pattern of ADR 1 2 using RNA blots (Figure 5B, top) coincides well 
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Figure 2. The Distribution of Genes Differentially Expressed in Standard and Defective Seed Coats of BotKi Claris and Harosoy 
Backgrounds During Three Stages of Immature Seed Coat Development. The genes that were differentially expressed in both Clark and 
Harosoy backgrounds at seed weight ranges of either (A) 50-100 mg (B) 100-200 mg and (C) 400-500 mg. Differential expression is defined as genes 
with >5RPKIVI expression, >2 fold differential expression, and p value £0.05. Differential expression of cell wall related genes comprise 19% of the 
total genes at the 100-200 mg seed weight stage and 5% of the 400-500 mg stage. 
doi:1 0.1 371 /journal.pone.0096342.g002 



with the RNA-Seq data in tliat the defective seed coats have 
reduced expression especially in the Clark isolines. In addition, the 
blots confirm that the ADR 12 is more highly expressed in 



standard seed coats of Clark than in Harosoy. As presented in the 
blots in Figure 5B, bottom panel, there was no differential 
expression observed from 4-5 days old soybean hypocotyls in 
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others 
12% 




Figure 3. The Distribution of Cell wall Related Genes Differentially Expressed Between Standard and Defective Seed Coats in Both 
Clark and Harosoy Backgrounds at the 1 00-200 mg Seed Weight Stage. The distribution of 51 differentially expressed genes into different 
cell wall structural component categories is shown with genes having annotations related to fasciclin (21%) and protein rich proteins (15%) 
predominating. 

doi:1 0.1 371 /journal.pone.0096342.g003 



standard and defective isolines of Clark and Harosoy. Thus, the 
differential expression of tliis gene is likely confined to the seed 
coats that manifest the defective cell wall structure. 

The expression pattern of two other hormone regulated genes in 
Tables 3 and 4 that showed significant differential expression in 
the seed coat of wild type and defective isolines of both Clark and 
Harosoy backgrounds is presented in Figures S6-L and S7-N. The 
gene that showed a gibbereUin regulated annotation had an 
expression pattern similar to PRPl with higher expression in 
defective isolines as compared to the standard seed coats at 100- 
200 mg seed weight (Figure S6-L), while the other gene that had 
an auxin related annotation showed a different expression pattern 
with under expression in the defective isoline at the 100-200 mg 
seed weight stage (Figure S7-N). 

Genes Differentially Expressed between Standard and 
Defective Isolines in Either the Clark or Harosoy 
Backgrounds 

Along with the genes that showed significant differential 
expression in the seed coats of both genetic backgrounds, there 
were genes that showed significant differential expression in either 
the Clark or Harosoy isolines. As presented in Table 2 and Table 
S2, there were approximately 1300 genes that showed differential 
expression in either the Clark or Harosoy background. These 
genes were divided into different broad functional categories based 
on functional annotations and presented in Figures S 1 1 , S 1 2 and 
SI 3. The cell wall category is highest at the 100-200 mg stages 
and comprises 14% in Clark and 44% of the total differentially 
expressed genes in Harosoy seed coats. Another major category 
was transcription factor and transcription related genes (up to 
11%). There were 109 genes with functional annotations that 
showed opposite differential expression [ie., if expression of a 
particular gene was higher in the standard isoline of Clark, then it 
was lower in the standard isoline of Harosoy) (Table S4). 



Transcription Factor Genes that were Differentially 
Expressed in Standard and Defective Seed Coat Isolines 

Two zinc finger domain (C2H2 type) containing protein genes 
that had TTl (TRANSPARENT TESTA 1) annotations were 
approximately 6-fold and 15-fold more highly expressed in the 
defective seed coats of the Clark and Harosoy lines, respectively, at 
the 100-200 mg seed weight stage (Tables 3 & 4). The expression 
level of these transcription factor genes at diflferent stages of 
development is presented in Figure 6. These genes showed higher 
expression at the younger stage in both standard and defective 
isolines in the Clark background and then decreased with 
progressive seed weight stage. They increased somewhat in the 
Harosoy line at the 100-200 mg seed stage before declining. 

Along with the two TTl gene models, there were 14 total 
transcription factor genes that showed differential expression in 
seed coats of standard and defective isolines in both of the Clark 
and Harosoy backgrounds at the 100-200 mg stage (Figure 7, 
Table S3). The most prominent category had NAC-related 
annotations. The other important classes of transcription factor 
genes were bHLH, zinc finger and ethylene responsive factor 
genes. The expression of these transcription factor genes at three 
stages of seed development is presented in Figure S14 and SI 5. 
There were two transcription factor genes (Myb and bHLH) that 
showed expression patterns similar to TT 1 and PRP 1 (Figwe S 1 4 
A &. C). Four of the other transcription factor genes showed 
expression patterns similar to PRP2 (Figure S15 A, C, E & H). 
Interestingly, there were no significantly differentially expressed 
transcription factor genes found in common at the 50-100 mg or 
400-500 mg seed weight stages in both backgrounds. 

Table S2 contains the data for transcription factors that were 
differentially expressed between standard and defective seed coats 
in only the Clark or the Harsoy background. At the 50—100 mg 
seed weight stage, there were 59 transcription factor genes that 
showed differential expression in the seeds coats of only the Clark 
isolines with the maximum number of genes being related to the 
zinc finger category (Figure SI 6) whereas in the Harosoy isolines 
at the 50-100 mg seed weight stage, Glymal5g42810. 1, was the 
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Figure 4. The Expression Levels of (A) PRP1 and (B) PRP2 Genes in Standard and Defective Seed Coats of eitKier Claris or Harosoy 
Backgrounds at Tliree Stages of Seed Development. The blue line in the graphs represents expression level (RPKM) of these genes in seed 
coats of standard isolines and the red line represents expression levels in the defective seed coats at the 50-100 mg, 100-200 mg, and 400-500 mg 
seed weight ranges. Maximal differential expression of the genes between the standard and defective isolines was observed at the 100-200 mg seed 
weight stage. (Abbreviations: CS: Clark standard, CD: Clark defective, HS: Harosoy standard, HD: Harosoy defective). 
doi:1 0.1 371 /journal.pone.0096342.g004 



only transcription factor gene that showed difTerential expression. 
Likewise, Figure SI 7 dehneates the functional annotations of the 
1 6 and 1 2 transcription factors that showed differential expression 
in the Clark background at the 100-200 or 400-500 mg stages, 
respectively, and Figure S18 shows the 43 and 111 transcription 
factors that were dififerentially expressed only in the Harosoy line 
at the later stages of development. 

Discussion 

Many Cell Wall Related Proteins are Affected by the Net 
Pattern Defective Seed Coat Mutation 

We provide an overview of the genes that are affected by the 
"net pattern" mutation which affects the structural integrity of the 
soybean seed coat. As shown in Table 2 and Table S3, a total of 
364 genes showed significant differential expression in both Clark 



and Harosoy isolines that contain the defective seed coat mutation. 
Approximately 1 5 % of the differentially expressed genes showed 
cell wall related annotations including proline-rich proteins, 
glycine-rich proteins, fasciclin-like arabinogalactan proteins, ex- 
pansins, and extensins. As reported earlier, these are the major 
classes of proteins that provide tensile strength and structure to the 
cell wall [19,20]. Xyloglucan [21], xylulose [22] and pectin had 
also been reported as structural components of the cell wall 
[23,24,25,26]. Structural cell wall GRPs have glycine contents up 
to 60-70% [27,28,29,30]. In our study, genes with glycine-rich 
protein annotation had glycine content from 11-40% in their 
amino acid sequence. 

Allergens, thaumatin-like proteins, and cysteine-rich domain 
containing proteins have also been reported to play important 
roles in various functions related to the cell wall [31,32,33]. The 
genes with these annotations were differentially expressed in 
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Figure 5. The Expression Levels of ADR12 (Glyma10g35870.1) Transcripts in Standard and Defective Seed Coat Isolines as 
Determined by RNA-Seq and RNA blots. (A) The levels of ADR1 2 (ADR1 2-2) transcripts (as RPKM) in seed coats of Clark (left) or Harosoy (right) 
isolines. The blue line in the graphs represents the expression level (RPKM) of this gene in seed coats of standard isolines and the red line represents 
expression levels in the defective seed coats at the 50-100 mg, 100-200 mg, and 400-500 mg seed weight ranges. (B, Top) The Northern blots of 
RNA from 100-200 mg soybean seed coats representing three independent extractions of the Clark and two independent samples of the Harsoy 
lines. (B, Bottom) The Northern blot of RNA from 4-5 days old soybean hypocotyls extracted from two independent samples of each line. Each gel 
lane was loaded with 10 ug of total RNA, electrophoresed, stained with ethidium bromide to confirm equal loadings, and blotted. The RNA blots 
were probed with Gm-rl 088-6964, a EST clone representing ADR12 and corresponding to Glyma10g35870.1. Whereas there is differential expression 
in the seed coats of standard and defective lines, there was no difference in expression of ADR12 in the hypocotyls. (Abbreviations: CS: Clark 
Standard, CD: Clark defective, HS: Harosoy standard, HD: Harosoy defective). 
doi:1 0.1 371 /journal.pone.0096342.g005 



standard and defective isolines in our study and may be involved in 
the appearance of cracking on the seed coat. Xylem serine 
proteinase! was one of the significantly differentially expressed 
genes at the 100-200 mg seed weight stage in this study. In 
Arabidopsis, this enzyme has been reported to be involved in 
xylem formation [34]. 

While we have concentrated on the 51 cell wall related genes 
that were significantly differentially expressed in both varieties, we 
note that there are 920 additional cell wall related genes that did 
not show differential expression in the seed coats of either of the 
two isoline pairs. Thus, there are many processes related to the cell 
wall that are not affected at the transcript level by the net-pattern 
mutation. 



Hormone Regulated Genes Affected by the Net Pattern 
Mutation 

Auxin is a plant growth regulator that is required for cell 
expansion, division and differentiation. Auxin pathway related 
genes have been reported to regulate several cell wall related genes 
in Arabidopsis [35,36]. Likewise gibberellins had been reported to 
alter the expression of cell wall loosening genes and to work in 
coordination with auxin and abscisic acid hormone pathways [37]. 
Genes related to the jasmonic acid pathway and ethylene response 
may also be related to cell wall formation [38,39]. As presented in 
the results (Tables S2-S4), there were differentially expressed 
genes in the standard and defective isolines that showed ethylene 
response related, auxin and gibberline hormone related/ regulated 
annotations suggesting an effect of the seed coat mutation on these 
genes. The role of the auxin downregulated ADR 12 transcript that 
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Figure 6. The Expression Levels of TRANSPARENT TESTA 1 (TT1 ) Genes in Standard or Defective Seed Coats of Clarl< and Harosoy 
Backgrounds. Botin Glyma14g35140 (Figure A) and Glyma13g01720 (Figure B) gene models had TT1 annotations. The blue line in the graphs 
represents expression level (RPKM) of these genes in seed coats of standard isolines and the red line represents expression levels in the defective seed 
coats at the 50-100 mg, 100-200 mg, and 400-500 mg seed weight ranges. There was significant differential expression in these genes at 100- 
200mg seed weight range. (Abbreviations: CS: Clark standard, CD: Clark defective, HS: Harosoy standard, HD: Harosoy defective). 
doi:1 0.1 371 /journal.pone.0096342.g006 
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Figure 7. The Annotations of Differentially Expressed Transcription Factor Genes in Standard and Defective Seed Coats in Common 
Between Clark and Harosoy Backgrounds at the 100-200mg Seed Weight stage. The classification is shown of 14 significantly differentially 
expressed transcription factor genes (a5RPKIVI, >2 fold differential expression and p-value £0.05) at the 100-200 mg seed weight stage that are in 
common between both the Clark and Harosoy backgrounds. 
doi:1 0.1 371/journal.pone.0096342.g007 



PLOS ONE I www.plosone.org 



12 



May 2014 | Volume 9 | Issue 5 | e96342 



RNA-Seq Profiling of Defective Seed Coat Mutation 



showed high differential expression in the standard and defective 
seed coats but not in the hypocotyls (Figure 5) is not known. The 
predicted peptide form the ADR12 transcript is small at only 71 
amino acids. Likely, many of the transcript differences for the 
affected pathways are confined to the seed coats as is the case for 
the ADR 1 2 transcript. Such an extensive disruption of cell wall 
metabolism by the suite of genes affected in the net pattern 
mutation that leads to producing tears in the walls would likely be 
detrimental to the growth of the seedlings but is tolerated in the 
latter stages of seed coat development. 

Transcript Abundance and Levels of Soluble Proline-Rich 
Proteins are Reduced in the Defective Seed Coat 

Mutation 

The proline-rich proteins are composed of small tandem repeats 
such as PPVYK or PPVEK, where the second proline is often 
hydroxyproline [40]. In our study PRPl and PRP2 were among 
the highly differentially expressed genes (Tables 3 and 4). PRPl 
was previously identified and characterized based on cDNA and 
amino acid sequence [41,42,43,44]. Likewise, PRP2 had also been 
characterized as a slighdy smaller protein than PRPl, and both 
consist essentially of the repeating decamer PPVYKPPVEK [45] . 
Gene model numbers corresponding to these genes were obtained 
from the Phytozome database that has genomic sequence from the 
Williams cultivar of Glycine max [46] . The Williams PRP2 protein is 
smaller when compared with the PRP2 in the cultivar Wayne from 
which it was originally sequenced as presented by amino acid 
sequence alignment in Figure S4. This corresponds with another 
study showing that in some soybean varieties both PRPl and 
PRP2 proteins are smaller because of in frame deletions in the 
coding region of units of the tandem decamer repeats. In previous 
reports, PRPl and PRP2 had also been reported to be linked, but 
separated by approximately I'.VAt recombination units [47]. A 
BLAST search for these genes in Phytozome showed that they are 
located on chromosome 9, approximately 146 kb apart (Figure 
85). 

As described in our results, the expression of PRPl was 
dramatically higher in the yellow Harosoy isolines (homozygous 
for / alleles), as compared to black Clark isolines (homozygous for 
the i allele) (Figure 4). Likewise, another study showed that the / 
locus which controls inhibition of anthocyanin accumulation in the 
epidermal cells of the soybean seed coat also affects abundance of 
PRPl mRNA and protein in the seed coat. Interestingly, an 
epistatic interaction between the recessive i and t alleles also causes 
cracking of the pigmented seed coat [48] . This seed coat cracking 
is not related to the defective seed coats of the net pattern 
described in this report which is independent of seed color as 
shown in Figure 1. However, this genetic interaction implies an 
interaction of the flavonoid pathway with cell wall structure as the t 
locus is known to encode a flavonoid 3' hydroxylase [17]. 

Our RNA-Seq data (Figure 4) show excellent agreement with 
the previous RNA blots [10] indicating a delay in the decline of 
PRPl transcripts in the defective seed coats leading to higher levels 
in the defective seed coats at the middle weight range of 100- 
200 mg. Despite the presence of significant levels of PRPl 
transcripts in the defective seed coats, no PRPl protein was 
detectable b)' immunoblotting in defective Clark seed t:oats at any 
stage of seed development but it was easily extractable from the 
standard, non-defective isoline at the same stages [10]. However, 
the similar PRP2 protein was extractable from both standard and 
defective seed coats. These results implied that a major 
physiological event in the net pattern defective seed coats may 
be the irreversible cross-linking of PRPl into the cell wall 
occurring in the developing seeds. 



Transcription Factors Affected by the Net Pattern 

Mutation 

Transcription factors are important players for controlling the 
flow of genetic information from DNA to RNA and ultimately 
affecting the growth and physiology of the plant. In this study, 
there were approximately 240 differentially expressed transcrip- 
tion factor genes at different seed weight stages (Tables S2-S4). 
These genes were divided into different classes as presented in 
Figures 7, S16, S17, and S18. The major classes of transcription 
factors genes which showed differential expression in our study 
were Myb, NAC, basic helix-loop-helix, WRKY, and zinc finger 
transcription factors. Transcription factors of these types been 
reported to affect cell wall integrity in other plant species. MYB 
domain containing transcription factors are involved in secondary 
ceU wall biosynthesis, pollen wall composition, mucilage deposi- 
tion, extrusion and hgnin deposition in Arabidopsis [49,50,51,52]. 
The MYB46 and MYB83 transcription factors are thought to 
regulate secondary wall biosynthesis in Arabidopsis [53,54]. A 
basic helix-loop-helix transcription factor had been reported to 
play an important role in tapetal cell development [55]. The role 
of a MADS-box transcription factor as RIPENING INHIBITOR 
(RIN) is in ceU waU metabohsm and carotenoid biosynthesis [56] 
and a basic leucine zipper domain containing transcription factor 
had been reported to affect pollen coat patterning [57]. As 
demonstrated by Wang et al, WRKY transcription factors are 
involved in regulation of downstream genes encoding the NAM, 
ATAFl/2, and CUC2 (NAC) and CCCH type (C3H) zinc finger 
TFs that activate secondary wall synthesis [58]. NAC domain 
containing transcription factors were reported to be involved in 
secondary cell wall biosynthesis and in the regulation of cellulose 
and hemicellulose biosynthetic genes in addition to those involved 
in hgnin polymerization and signaling in Arabidopsis and 
Eucalyptus [59,60]. Rice and maize SWNs (Secondary wall 
NACs) and MYB46 had been reported as master transcriptional 
activators of the secondary wall biosynthetic program [61]. The 
AP2 domain containing family is an ethylene responsive group of 
transcription factors, and in Arabidopsis thaliana, these were 
expressed in specific cell types of roots, stems and seeds that 
undergo suberization [62]. The TTl genes were another class of 
the significantly differentially expressed genes in this study 
(Figure 6). These transcription factor genes had been reported to 
be involved in seed coat pigmentation and integrity in Arabidopsis 
[63]. 

It is likely that the defective seed coat mutation in soybean sets 
in motion changes in many pathways related to the cell wall. Some 
of the differential expression may be a temporal shift in 
developmental app(;aranci's of transcripts. Notably, in the case 
of the TTl homolog, it appears that the decline in its transcript 
levels was delayed in the defective seed coats relative to the 
standard seed coats (Figure 6), thus leading to significant 
differential expression at the 100-200 mg seed weight range-. This 
profile is similar to the pattern exhibited by the PRPl protein 
(Figure 4A) in that its decUne in expression appears to be slowed. 
Thus, a developmental delay in the normal expression patterns of 
multiple genes appeared to be a feature of the effect of this 
mutation as shown in Figures S6-S10 and S14— S15. These shifts 
may be one of the reasons for significant differential expression at 
the mid-seed weight range of 100-200 mg and may indicate that 
the triggering event occurs at or before the 50 mg seed weight 
range. 

Summary 

We present an overview of genes whose expression was affected 
by the "net pattern" mutation in soybean. Two isolines of different 
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genetic backgrounds were used to understand physiology related 
to this mutation that causes defective cracks in the seed coats. 
Twelve sampk-s representing three stages of seed coat develop- 
ment from each of the four lines were subjected to the power of 
next-generation sequencing to obtain millions of transcript reads 
that were mapped to all 78,773 high and low confidence gene 
models of Glycine max. There were approximately 1300 significandy 
differentially expressed genes affected by this mutation in each 
isoline pair with 364 in common between the two isolines of 
different genetic backgrounds. The cell wall structural protein 
genes including proline-rich proteins (PRPl and PRP2) and 
glycine-rich proteins were among the transcripts affected by the 
mutation as well as a number of transcription factors. Some of the 
transcript patterns indicate that a developmental shift in timing of 
gene expression underpins the differential gene expression changes 
at mid-maturation as shown by graphical presentation of 
expression patterns for 82 out of 364 genes that were significandy 
differentially expressed in both the Clark and Harosoy back- 
grounds. In summary, the mutation appears to set in motion a 
complex series of events, many manifested at the transcript level, 
that lead to changes in physiology and ultimately structure of the 
cell wall. The information on potential candidate genes for this 
mutation and the regulated genes will not only be helpful in 
understanding cell wall physiology in soybean, but could also 
provide a platform for improvement in bioenergy crops like 
sorghum, maize and Miscanthus. 

Methods 

Plant Materials and Genetic Nomenclature 

The isolines of Glycine max used in this study were a black seeded 
Clark line, Clark defective seed coat, Harosoy standard and 
Harosoy defective seed coat. The defective seed coat mutation was 
originally found in two PI (plant introduction) lines, PI 339.994 
and PI423.730B and these lines were backcrossed to the black 
seeded Clark line (UC9) to create an isoline UC412 containing this 
mutation. The same mutation was also backcrossed into the 
Harosoy line to create an isoline UC508. The L or PI numbers 
represent the official USDA isoline or plant introduction numbers, 
respectively. The UC number is an internal number used by our 
laboratory. Detailed information on these lines is presented in 
Table 1. 

RNA-Seq Method 

For RNA extraction, immature seeds were harvested over the 
course of several weeks. The individual seeds were pooled and 
sorted by weight. Following this, each seed was dissected, 
separating the cotyledon and embryo from the seed coat, and 
placed in separate 15-ml polypropylene centrifuge tubes (Corning, 
Acton, MA). All tissue was then frozen in liquid nitrogen and 
placed in storage at — 80°C until it could be lyophilized. RNA of 
seed coats of seeds from the 50-100 mg, 100-200 mg and 400- 
500 mg seed weight stages of each isoline was extracted separately 
using the RNA for 5 ml volumes from ~30 mg (50—100 mg and 
100-200 mg seed weight stage) and ~70 mg (400-500 mg seed 
weight stage) dry weight. The modified protocol used here is based 
on the protocols of McCart)' [64] . Extraction of RNA from seed 
coats with dark pigmentation requires the use of a modified 
extraction protocol which prevents procyanidins from binding the 
RNA [65,66]. RNA was extracted from 200 mg of freeze-dried 
seed coats by using phenol chloroform extraction and lithium 
chloride precipitation supplemented with PVPP (polyvinylpyrrol- 
idone), polyproline and BSA (bovine serum albumin). Library 
construction and high-throughput sequencing was carried out 



using RNA-Seq technology using lUumina Gall and HiSeq2000 
instruments by Keck Center, University of lUinois. 

RNA-Seq Alignment and Data Normalization 

The 75 bp and lOObp reads were mapped to the 78,773 Glyma 
cDNA gene models (JGI/Phytozome) using Bowtie [1 1] with up to 
3 mismatches allowed and up to 25 alignments. The total number 
of generated raw reads for Clark standard and defective isolines 
and for Harosoy isolines at three different stages of seed 
development is presented in Table 1. RNA-Seq data were 
normalized in reads per kilobase of gene model per million 
mapped reads (RPKM), as the RNA-Seq depends on the 
transcript length since the reads wiU map to all positions of the 
transcript [18]. 

Annotation of glyma gene models. Coding region gene 
models were collected from the mask(;d soybc^an genome from 
Phytozome version 6.0 and the soybean genome version 1.0 GFF 
file of the soybean genome [12] which is available at SoyBase [67]. 
In addition to the PFAM annotations that were downloaded from 
Phytozome, the 78,773 models (that include both high and low 
confidence models) were further annotated using BLASTX against 
the non-redundant (nr) database of the National Center for 
Biotechnology Information (NCBI), and for trEMBL and Swiss 
prot of the European Bioinformatics Institute (EBI) as described by 
Hunt etal [68]. 

DESeq data analysis. DESeq (available via Bioconductor) is 
an R package to analyze count data from high-throughput 
sequencing data such as RNA-Seq and test for differential 
expression [69]. The basemean value and p values for each 
Glyma model were analyzed using this package for 78,773 Glyma 
models using comparison between Clark standard and defective or 
Harosoy standard and defective isolines. 

Northern Blots 

Total RNA was extracted from the frozen tissue including 4—5 
day old soybean hypocotyls minus their cotyledons and seed coats 
of 100-200 mg seeds from four lines using a standard phenol 
chloroform method with lithium chloride precipitation [64] or the 
modified method for pigmented tissues [65,66]. RNA samples 
were quantified by spectrophotometer and the integrity 
was confirmed using agarose gel electrophoresis. RNA was stored 
at — 80°C until further use. For RNA gel blot analysis, 10 |a,g of 
total RNA was electrophoresed through 1.2% agarose/ 1.1% 
formaldehyde gels [70] blotted onto nitrocellulose membranes 
(Schlei[:her & Schuell, Keene, NH) via capillar)- action with 10 x 
SSC (1.5 M NaCl and 0.15 M sodium citrate, pH - 1) overnight. 
After blotting, RNA was cross-linked to the nitrocellulose 
membranes with UV radiation by a UV cross-linker (Stratagene, 
La JoUa, CA). Nitrocellulose RNA gel blots were then prehybrid- 
ized, hybridized, washed, and exposed to Hyperfdm (Amersham, 
Piscataway,NJ) as described by Todd and Vodkin [14]. An 
ADR12 gene EST was (Gm-r 1088-6964) used as probe and 
labeled with [a-'*^P]dATP by random primer reaction method 
[71]. 

Accession Numbers 

The data have been entered into Gene Expression Omnibus at 
that National Center for Biotechnology Information as Accession 
Series GSE54903. 

Supporting Information 

Figure SI The Distribution of Expression Levels in RPKMs of 
Differentially Expressed Genes in the Clark Isoline Pair at Three 
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DifiFerent Seed Weight Stages. (A) 50-100 mg in Clark Standard 
(B) 50-100 mg in Clark Defective (C) 100-200 mg in Clark 
Standard (D) 100-200 mg in Clark Defective (E) 400-500 mg 
Clark Standard (F) 400-500 mg Clark Defective. 
(TIFF) 

Figure S2 The Distribution of Expression Levels of Differen- 

tiall)' Expressed Genes in the Harosoy Isoline Pair at Three 
Different Seed \V(;iglit Stages. (A) 50-100 mg in Harosy Standard 
(B) 50-100 mg in flarosoy Defeetive (C) 100-200 mg in Harosoy 
Standard (D) 100-200 mg in Harosoy Defective (E) 400-500 mg 
Harosoy Standard (F) 400-500 mg Harosoy Defective. 
(TIFF) 

Figure S3 The Nucleotide and Amino Acid Sequence Align- 
ment for PRPl. (A) Alignment of PRPl nucleotide sequence from 
NCBI J02746) and PRPl sequence (Glyma09gl2200. 1) from 
Phytozome database (B) Alignment of PRPl amino acid sequence 
from NCBI with PRPl amino acid sequence from Phytozome 
database. The nucleotide and amino acid sequence from Williams 
(Phytozome) and Wayne (NCBI) cultivars showed 100% sequence 
homology. 
(TIFF) 

Figure S4 The Nucleotide and Amino Acid Sequence .Align- 
ment of the PRP2 Gene. (A) Alignment of PRP2 nucleotide 
sequence from NCBI (J05208) and PRP2 sequence from 
Phytozome database (Glyma09gl2260.1) (B) Alignment of PRP2 
amino acid sequence from NCBI with PRP2 amino acid sequence 
from Phytozome database. There is difference of one tandem 
repeat in the amino acid sequence. The PRP2 protein is shorter in 
Williams (Phytozome) as compared to Wayne (NCBI) cultivar. 
(TIFF) 

Figure S5 The Genome Organization of PRPl and PRP2. As in 
the Phytozome database, PRPl and PRP2 are on chromosome 9 
and the distance between these genes is ~146 kb. 
(TIFF) 

Figure S6 The Expression Pattern of 13 Selected Differentially 
Expressed Genes in the Seed Coat of Wildtype and Defective 
Isolines in Both Clark and Harosoy Backgrounds. The genes over 
expressed in the defective isolines (Tables 3a, 4a and Table S3). 
CS: Clark Standard, CD: Clark Defective, HS: Harosoy Standard, 
HD: Harosoy Defective. 
(TIF) 

Figure S7 The Expression Pattern of 16 Selected Differentially 
Expressed Genes in the Seed Coat of Wildtype and Defective 
Isolines in Both Clark and Harosoy Background. The genes 
overexpressed in the Standard isolines (Tables 3b, 4b and Table 
S3). CS: Clark Standard, CD: Clark Defective, HS: Harosoy 
Standard, HD: Harosoy Defective. 
(TIF) 

Figure S8 The Expression Pattern of 1 1 Fasciclin-like Arabino- 

galactan Gene in the Seed Coat of Wildtype and Defective Isolines 
in Both Clark and Harosoy Background. These genes showed 
higher expression in defective isoline as compare to wildtype 
isoline (Table S3). CS: Clark Standard, CD: Clark Defective, HS: 
Harosoy Standard, HD: Harosoy Defective. 
(TIF) 

Figure S9 The Expression Pattern of Cell Wall Genes 
Overexpressed in the Seed Coats of Defective Isolines in Both 

Clark and Harosoy Background. These graphs present data for 25 
additional cell wall related genes not previously shown in Figure 
S6 and S7. Their RPKM and p value data are presented in Table 



S3. CS: Clark Standard, CD: Clark Defective, HS: Harosoy 

Standard, HD: Harosoy Defective. 

(TIF) 

Figure SIO The Expression Pattern of Cell Wall Genes 
Overexpressed in the Standard Seed Coats of Both Clark and 
Harosoy Background. These graphs present data for 3 additional 
cell wall related genes not previously shown in Figure S6 and S7. 
Their RPKM and p value data are presented in Table S3. CS: 
Clark Standard, CD: Clark Defective, HS: Harosoy Standard, 
HD: Harosoy Defective. 
(TIFF) 

Figure Sll The Distribution of Differentially Expressed Genes 
in the Seed Coats of Either (A) Clark or (B) Harosoy Backgrounds 
at the 50-100 mg Seed Weight Stage. The number of differen- 
tially expressed genes (>5RPKM, £2 fold differential expression, 
p-valueS0.05) was (A) 720 genes in Clark isolines and (B) 48 genes 
in Harosoy isolines. In both backgrounds, there were major 
categories related to the cell wall. 
(TIFF) 

Figure S12 The Distribution of Differentially Expressed Genes 
in Seed Coats of Either (A) Clark or (B) Harosoy Backgrounds at 
the 100-200 mg Seed Weight Stage. The number of differentially 

expressed genes (>5RPKM, >2 fold differential expression, p- 
value £0.05) was (A) 173 genes in Clark isolines and (B) 156 genes 
in Harosoy isolines at 100-200 mg seed weight stage. In both 
backgrounds, one of the major categories was related to the cell 
wall. 
(TIFF) 

Figure S13 The Distribution of Differentially Expressed Genes 
in Seed Coats of (A) Clark or (B) Harosoy Backgrounds at the 400- 
500 mg Seed Weight Stage. The number of differentially 
expressed genes (a5RPKM, >2 fold differential expression, p- 
value SO.05) was (A) 417 genes in Clark isohnes and (B) 1068 
genes in Harosoy isohnes. In both backgrounds, the cell weiU 
related genes occupied 5% of the chart area. 
(TIFF) 

Figure S14 The Expression Pattern of Differentially Expressed 
Transcription Factor Genes That Showed Higher Expression in 

the Seed Coat of Defective Isolines in Both Clark and Harosoy 
Background. Overexpressed in defective isolines. CS: Clark 
Standard, CD: Clark Defective, HS: Harosoy Standard, HD: 
Harosoy Defective. 
(TIF) 

Figure S15 The Expression Pattern of Differentially Expressed 
Transcription Factor Genes That Showed Higher Expression in 
the Seed Coat of Standard Isolines in Both Clark and Harosoy 

Ba(kground. Overexpressed in standard isolines. CS: Clark 
Standard, CD: Clark Defective, HS: Harosoy Standard, HD: 
Harosoy Defective. 
(TIF) 

Figure S16 The Distribution of Differentially Expressed Tran- 
scription Factor Genes into Different Classes in Clark at the 50- 
100 mg Seed Weight Stage. The 59 differentially expressed 
transcription factor genes (S5RPKM, a2 fold differential 
expression, p-value SO.05) were divided into 12 classes based on 
functional annotations. 
(TIFF) 

Figure S17 The Distribution of Differentially Expressed Tran- 
scription Factor Genes in (A) Clark or (B) Harosoy Backgrounds at 
die 100-200 mg Seed Weight Stage. (A) The 16 diflferentiaUy 
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expressed transcription factor genes (S5RPKM, £2 fold differ- 
ential expression, p-valueS.05) were divided into 9 different 
classes (B) The 12 differentially expressed transcription factor 
genes (>5RPKM, >2 fold differential expression, p-value <0.05) 
were divided into 10 different classes based on functional 
annotation. 
(TIFF) 

Figure S18 The Distribution of Differentially Expressed Tran- 
scription Factor Genes in Different Classes in (A) Clark and (B) 
Harosoy Backgrounds at the 400-500 mg Seed Weight Stage. (A) 
The 43 differentially expressed transcription factor genes 
5RPKM, >2 fold differential expression, p-value ^.05) were 
divided into 1 0 different classes (B) The 1 1 1 differentially 
expressed transcription factor genes (&5RPKM, &2 fold differ- 
ential expression, p-value £0.05) were divided into 15 different 
classes based on functional annotation. 
(TIFF) 

Table SI The Total Number of Expressed Genes in Seed Coats 
of the Different Isolines. The total number of expressed genes at S 
10 RPKM or 2:1 RPKM in different backgrounds at different 
seed weight stages are presented in this table. CS: Clark standard, 
CD: Clark defective, HS: Harosoy Standard and HD: Harosoy 
defective. 
(DOCX) 
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